The security of artificial intelligence (AI) is an important research area for building safe, reliable, and trustworthy AI systems. To accelerate research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, the China Industrial Control Systems Cyber Emergency Response Team, the Institute for Artificial Intelligence at Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks: the Deepfake Security Competition, the Autonomous Driving Security Competition, and the Face Recognition Security Competition. This report introduces the competition rules of these three tracks and the solutions of the top-ranking teams in each track.
Homography estimation is error-prone for large-baseline image pairs because of low image overlap and a limited receptive field. To address this, we propose a progressive estimation strategy that converts a large-baseline homography into multiple intermediate ones; cumulatively multiplying these intermediate homographies reconstructs the initial homography. We also introduce a semi-supervised homography identity loss consisting of two components: a supervised objective and an unsupervised objective. The supervised component optimizes the intermediate homographies, while the unsupervised component helps estimate a large-baseline homography without photometric losses. To validate our method, we propose a large-scale dataset that covers regular and challenging scenes. Experiments show that our method achieves state-of-the-art performance in large-baseline scenes while remaining competitive in small-baseline scenes. Code and dataset are available at https://github.com/megvii-research/LBHomo.
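To make the cumulative multiplication concrete, the following is a minimal sketch of chaining intermediate homographies into a single one. It is not taken from the LBHomo code: the function names `matmul3`, `compose`, and `warp_point` are hypothetical, and the sketch assumes each intermediate homography maps points from one intermediate view to the next.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def compose(homographies):
    """Chain intermediate homographies H_1, ..., H_n, applied in that order.

    Assumption: H_k maps points of view k-1 to view k, so the overall
    homography is the product H_n * ... * H_1.
    """
    total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for h in homographies:
        total = matmul3(h, total)  # left-multiply by the next step
    return total


def warp_point(h, x, y):
    """Apply a homography to a 2D point using homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


# Two small intermediate steps (illustrative values only).
H1 = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 3.0], [0.001, 0.0, 1.0]]
H = compose([H1, H2])

# Warping step by step agrees with warping once through the product.
step_by_step = warp_point(H2, *warp_point(H1, 1.0, 1.0))
direct = warp_point(H, 1.0, 1.0)
```

Because a homography is defined only up to scale, the product can be rescaled (for example so that its bottom-right entry is 1) without changing the warp.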
Graph neural networks have achieved significant success in representation learning. However, the performance gains come at a cost: acquiring comprehensive labeled data for training can be prohibitively expensive. Active learning mitigates this issue by searching the unexplored data space and prioritizing the selection of data to maximize the model's performance gain. In this paper, we propose SMARTQUERY, a novel framework that learns a graph neural network with very few labeled nodes using a hybrid uncertainty reduction function. This is achieved in two key steps: (a) designing a multi-stage active graph learning framework that exploits diverse explicit graph information and (b) introducing label propagation to efficiently exploit known labels when assessing the implicit embedding information. Using a comprehensive set of experiments on three network datasets, we demonstrate the competitive performance of our method against state-of-the-art methods with very few labeled data (up to 5 labeled nodes per class).
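As a rough illustration of how label propagation spreads a handful of known labels over a graph, here is a minimal sketch of the classic neighbor-averaging scheme with clamped labels. It is an assumption for illustration only, not the SMARTQUERY implementation: the function names, the adjacency-list input format, and the fixed iteration count are all hypothetical.

```python
def label_propagation(adj, labels, num_classes, iters=100):
    """Propagate class distributions over an undirected graph.

    adj:    dict mapping each node to a list of its neighbors
    labels: dict mapping the few labeled nodes to their class index
    Labeled nodes stay clamped to their one-hot label; every other node
    repeatedly takes the average distribution of its neighbors.
    """
    dist = {}
    for node in adj:
        if node in labels:
            dist[node] = [1.0 if c == labels[node] else 0.0
                          for c in range(num_classes)]
        else:
            dist[node] = [1.0 / num_classes] * num_classes

    for _ in range(iters):
        new_dist = {}
        for node, nbrs in adj.items():
            if node in labels or not nbrs:
                new_dist[node] = dist[node]  # clamped or isolated node
            else:
                new_dist[node] = [
                    sum(dist[m][c] for m in nbrs) / len(nbrs)
                    for c in range(num_classes)
                ]
        dist = new_dist
    return dist


def predict(dist):
    """Pick the most probable class for every node."""
    return {n: max(range(len(p)), key=p.__getitem__) for n, p in dist.items()}


# A path graph 0-1-2-3-4-5 with one labeled node per class.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = {0: 0, 5: 1}
dist = label_propagation(adj, labels, num_classes=2)
pred = predict(dist)
```

In this toy graph, nodes closer to the class-0 seed end up assigned to class 0 and nodes closer to the class-1 seed to class 1; an active learner could then query nodes whose propagated distribution remains uncertain.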
Most existing Spiking Neural Network (SNN) works state that SNNs may utilize the temporal information dynamics of spikes. However, an explicit analysis of temporal information dynamics is still missing. In this paper, we ask several important questions to provide a fundamental understanding of SNNs: What are temporal information dynamics inside SNNs? How can we measure the temporal information dynamics? How do the temporal information dynamics affect the overall learning performance? To answer these questions, we empirically estimate the Fisher information of the weights to measure the distribution of temporal information during training. Surprisingly, as training goes on, Fisher information starts to concentrate in the early timesteps. After training, we observe that information becomes highly concentrated in the first few timesteps, a phenomenon we refer to as temporal information concentration. By conducting extensive experiments on various configurations such as architecture, dataset, optimization strategy, time constant, and timesteps, we observe that temporal information concentration is a common learning feature of SNNs. Furthermore, to reveal how temporal information concentration affects the performance of SNNs, we design a loss function to change the trend of temporal information. We find that temporal information concentration is crucial to building a robust SNN but has little effect on classification accuracy. Finally, we propose an efficient iterative pruning method based on our observations of temporal information concentration. Code is available at https://github.com/Intelligent-Computing-Lab-Yale/Exploring-Temporal-Information-Dynamics-in-Spiking-Neural-Networks.
Existing correspondence datasets for two-dimensional (2D) cartoons suffer from simple frame composition and monotonic movements, making them insufficient to simulate real animations. In this work, we present a new 2D animation visual correspondence dataset, AnimeRun, built by converting open-source three-dimensional (3D) movies to full scenes in 2D style, including simultaneously moving backgrounds and interactions between multiple subjects. Our analyses show that the proposed dataset not only resembles real anime more closely in image composition, but also possesses richer and more complex motion patterns than existing datasets. With this dataset, we establish a comprehensive benchmark by evaluating several existing optical flow and segment matching methods, and we analyze the shortcomings of these methods on animation data. Data, code, and other supplementary materials are available at https://lisiyao21.github.io/projects/AnimeRun.
Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and Dice losses, often fall short in boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We develop a novel loss function tailored to reflect boundary information and thereby enhance boundary detection. As the contrast between segmentation and background regions along the classification boundary naturally induces heterogeneity over the pixels, we propose the piece-wise two-sample t-test augmented (PTA) loss, which is infused with a statistical test for such heterogeneity. We demonstrate the improved boundary detection power of the PTA loss compared to benchmark losses without a t-test component.
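To illustrate the statistic underlying a two-sample t-test on pixel groups, the sketch below computes a t-statistic between foreground and background intensities split by a binary mask. It is not the PTA loss itself: the choice of Welch's unequal-variance form, the function names `welch_t` and `split_by_mask`, and the toy intensities are assumptions made for this example.

```python
import math


def welch_t(a, b):
    """Welch's two-sample t-statistic for two lists of numbers.

    Assumes each list has at least two values and that the two samples
    are not both constant (otherwise the denominator is zero).
    """
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)  # unbiased variance
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)


def split_by_mask(pixels, mask):
    """Split pixel intensities into foreground and background groups."""
    fg = [p for p, m in zip(pixels, mask) if m]
    bg = [p for p, m in zip(pixels, mask) if not m]
    return fg, bg


# Toy intensities near a boundary: bright object, dark background.
pixels = [0.9, 0.8, 1.0, 0.1, 0.2, 0.0]
good_mask = [1, 1, 1, 0, 0, 0]  # follows the true boundary
bad_mask = [1, 0, 1, 0, 1, 0]   # mixes object and background pixels

t_good = welch_t(*split_by_mask(pixels, good_mask))
t_bad = welch_t(*split_by_mask(pixels, bad_mask))
```

A mask that follows the true boundary separates the two groups sharply and yields a much larger absolute t-statistic than a misaligned mask, which is the kind of heterogeneity signal a t-test-based term can reward.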
In this work, we present a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods. Our approach is inspired by the recently developed implicit mapping and positioning system and further extends the idea so that it can be freely applied to practical scenarios. Specifically, we leverage a voxel-based neural implicit surface representation to encode and optimize the scene inside each voxel. Furthermore, we adopt an octree-based structure to divide the scene and support dynamic expansion, enabling our system to track and map arbitrary scenes without the prior knowledge of the environment that previous works require. Moreover, we propose a high-performance multi-process framework to speed up the method, thus supporting some applications that require real-time performance. The evaluation results show that our method achieves better accuracy and completeness than previous methods. We also show that Vox-Fusion can be used in augmented reality and virtual reality applications. Our source code is publicly available at https://github.com/zju3dv/Vox-Fusion.
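Noncontact particle manipulation (NPM) technology has greatly extended human analytical capability to the micro and nano scales, which in turn has greatly promoted the development of materials science and life science. Although NPM by means of electric, magnetic, and optical fields has achieved great success, from a robotics perspective it remains a labor-intensive operation, because professional human assistance is still required in the early preparation stage. Developing automated noncontact trapping of moving particles is therefore worthwhile, particularly for applications in which particle samples are rare, fragile, or contact-sensitive. Taking advantage of the latest dynamic acoustic field modulation technology, and in particular the great scalability of acoustic manipulation from the micro scale to the sub-centimeter scale, we propose in this paper an automated noncontact trapping method for moving micro-particles that uses an ultrasonic phased array system and microscopic vision. To the best of our knowledge, the main contribution of this work is to achieve, for the first time, fully automated trapping of moving micro-particles in the acoustic NPM field by resorting to a robotic approach. In short, the moving state of the particle is observed and predicted by a binocular microscopic vision system, and the acoustic trapping zone is calculated and generated with reference to that state. The hand-eye relationship problem of the noncontact robotic end-effector is also solved in this work. Experiments demonstrate the effectiveness of this work.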
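Classifying an object behind a random and unknown scattering medium is a challenging task for the computational imaging and machine vision fields. Recent deep learning-based approaches have demonstrated the classification of objects using diffuser-distorted patterns collected by an image sensor. These methods require relatively large-scale computation using deep neural networks running on digital computers. Here, we present an all-optical processor that directly classifies unknown objects through unknown, random phase diffusers using broadband illumination detected with a single pixel. A set of transmissive diffractive layers, optimized using deep learning, forms a physical network that all-optically maps the spatial information of an input object behind a random diffuser into the power spectrum of the output light, which is detected through a single pixel at the output plane of the diffractive network. We numerically demonstrate this framework using broadband radiation to classify unknown handwritten digits through random new diffusers that were never used during the training phase, and achieve a blind testing accuracy of 88.53%. This single-pixel all-optical object classification system through random diffusers is based on passive diffractive layers and can operate in any part of the electromagnetic spectrum by simply scaling the diffractive features in proportion to the wavelength range of interest. These results have various potential applications in, for example, biomedical imaging, security, robotics, and autonomous driving.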
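Motivation: Cancer is heterogeneous, which complicates precise approaches to personalized treatment. Accurate subtyping can lead to better survival rates for cancer patients. High-throughput technologies provide multiple omics data for cancer subtyping. However, precise cancer subtyping remains challenging because of the large volume and high dimensionality of omics data. Results: This study proposes Subtype-Former, a deep learning method based on MLP and Transformer blocks, to extract a low-dimensional representation of multi-omics data. K-means and consensus clustering are also used to obtain accurate subtyping results. We compare Subtype-Former with other state-of-the-art subtyping methods across 10 TCGA cancer types. We find that, based on survival analysis, Subtype-Former performs better on benchmark datasets of more than 5000 tumors. In addition, Subtype-Former achieves outstanding results in pan-cancer subtyping, which can help analyze the commonalities and differences across various cancer types at the molecular level. Finally, we apply Subtype-Former to the 10 TCGA cancer types and identify 50 essential biomarkers, which can be used to study targeted cancer drugs and to advance cancer treatment in the era of precision medicine.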